SHORT GCG EXPANSIONS IN THE PAB II GENE FOR OCULOPHARYNGEAL
MUSCULAR DYSTROPHY AND DIAGNOSTIC THEREOF

BACKGROUND OF THE INVENTION 

(a) Field of the Invention

The invention relates to the PAB II gene and its uses for
the diagnosis, prognosis and treatment of a disease
related to protein accumulation in the nucleus, such as
oculopharyngeal muscular dystrophy.

(b) Description of Prior Art

Autosomal dominant oculopharyngeal muscular dystrophy
(OPMD) is an adult-onset disease with a worldwide
distribution. It usually presents in the sixth decade
with progressive swallowing difficulties (dysphagia),
eyelid drooping (ptosis) and proximal limb weakness.
Unique nuclear filament inclusions in skeletal muscle
fibers are its pathological hallmark (Tome, F.M.S. &
Fardeau, Acta Neuropath. 49, 85-87 (1980)). We isolated
the poly(A) binding protein II (PAB II) gene from a
217 kb candidate interval in chromosome 14q11. A (GCG)6
repeat encoding a polyalanine tract located at the
N-terminus of the protein was expanded to (GCG)8-13 in
the 144 OPMD families screened. More severe phenotypes
were observed in compound heterozygotes for the (GCG)9
mutation and a (GCG)7 allele found in 2% of the
population, whereas homozygosity for the (GCG)7 allele
leads to autosomal recessive OPMD. Thus the (GCG)7 allele
is an example of a polymorphism which can act either as a
modifier of a dominant phenotype or as a recessive
mutation. Pathological expansions of the polyalanine
tract may cause mutated PAB II oligomers to accumulate as
filament inclusions in nuclei.

It would be highly desirable to be provided with a tool
for the diagnosis, prognosis and treatment of a disease
related to polyalanine accumulation in the nucleus, such
as oculopharyngeal muscular dystrophy.

SUMMARY OF THE INVENTION

One aim of the present invention is to provide a tool for
the diagnosis, prognosis and treatment of a disease
related to polyalanine accumulation in the nucleus, such
as oculopharyngeal muscular dystrophy.

In accordance with the present invention there is
provided a human PAB II gene containing a transcribed
polymorphic GCG repeat, which comprises a sequence as set
forth in Fig. 4, which includes introns and flanking
genomic sequence.

The allelic variants of the GCG repeat of the human PAB II
gene are associated with a disease related to protein
accumulation in the nucleus, such as polyalanine
accumulation, or with a disease related to swallowing
difficulties, such as oculopharyngeal muscular dystrophy.

In accordance with the present invention there is also
provided a method for the diagnosis of a disease with
protein accumulation in the nucleus in a patient, which
comprises the steps of:

a) obtaining a nucleic acid sample of said patient; and

b) determining allelic variants of the GCG repeat of the
human PAB II gene, wherein long allelic variants are
indicative of a disease related to protein
accumulation in the nucleus, such as polyalanine
accumulation and oculopharyngeal muscular dystrophy.

The long allelic variants are from about 245 to about
263 bp in length.
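
The 245 to 263 bp range is consistent with figures given in the detailed description: the PCR fragment spanning the repeat is reported as 242 bp for the normal (GCG)6 allele, and each additional GCG codon adds 3 bp. The following is a minimal Python sketch of that arithmetic only; the function and constant names are illustrative and are not part of the disclosure.

```python
# Illustrative arithmetic only: names below are hypothetical.
NORMAL_REPEATS = 6       # (GCG)6 normal allele
NORMAL_PRODUCT_BP = 242  # PCR fragment length reported for the (GCG)6 allele
BP_PER_REPEAT = 3        # one GCG codon


def product_length(repeats: int) -> int:
    """Expected PCR product length (bp) for a (GCG)n allele."""
    if repeats < 1:
        raise ValueError("repeat count must be positive")
    return NORMAL_PRODUCT_BP + BP_PER_REPEAT * (repeats - NORMAL_REPEATS)


def repeats_from_length(length_bp: int) -> int:
    """Infer the repeat count from an observed product length (bp)."""
    extra, remainder = divmod(length_bp - NORMAL_PRODUCT_BP, BP_PER_REPEAT)
    if remainder:
        raise ValueError("length is not a whole number of GCG codons from 242 bp")
    return NORMAL_REPEATS + extra


if __name__ == "__main__":
    for n in range(6, 14):
        print(f"(GCG){n}: {product_length(n)} bp")
```

Under these assumptions the (GCG)7 allele corresponds to 245 bp and the (GCG)13 allele to 263 bp, which matches the stated bounds.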

In accordance with the present invention there is also
provided a non-human mammal model for the human PAB II
gene, whose germ cells and somatic cells are modified to
express at least one allelic variant of the PAB II gene,
wherein said allelic variant of the PAB II gene is
introduced into the mammal, or an ancestor of the mammal,
at an embryonic stage.

In accordance with the present invention there 
is also provided a method for the screening of thera- 
peutic agents for the prevention and/or treatment of 
oculopharyngeal muscular dystrophy, which comprises the 
steps of:

a) administering said therapeutic agents to the 
non-human mammal of the present invention or 
oculopharyngeal muscular dystrophy patients; and 

b) evaluating the prevention and/or treatment of 
development of oculopharyngeal muscular dystro- 
phy in said mammal or said patients. 

In accordance with the present invention there is also
provided a method to identify genes that are part of or
interact with a biochemical pathway affected by the PAB II
gene, which comprises the steps of:

a) designing probes and/or primers using the hGTl gene of
the PAB II gene and screening samples from
oculopharyngeal muscular dystrophy patients with said
probes and/or primers; and

b) evaluating the role of the identified gene in
oculopharyngeal muscular dystrophy patients.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figs. 1A-B illustrate the positional cloning of the
PAB II gene;

Figs. 2A-G illustrate the OPMD (GCG)n expansion sizes and
sequence of mutations (SEQ ID NOS:1-2);

Fig. 3 illustrates the age distribution of swallowing
time (st) for French Canadian OPMD carriers of the (GCG)9
mutation; and

Fig. 4 illustrates the nucleotide sequence of human
poly(A) binding protein II (hPAB II) (SEQ ID NO:3).

DETAILED DESCRIPTION OF THE INVENTION

In order to identify the gene mutated in OPMD, we
constructed a 350 kb cosmid contig between flanking
markers D14S990 and D14S1457 (Fig. 1A). Positions of the
PAB II selected cDNA clones are shown in relation to the
EcoRI restriction map and the Genealogy-based Estimate of
Historical Meiosis (GEHM)-derived candidate interval
(Rommens, J.M. et al., in Proceedings of the third
international workshop on the identification of
transcribed sequences (eds. Hochgeschwender, U. &
Gardiner, K.) 65-79 (Plenum, New York, 1994)).

The human poly(A) binding protein II gene (PAB II) is
encoded by the nucleotide sequence as set forth in
Fig. 4.

Twenty-five cDNAs were isolated by cDNA selection from
the candidate interval (Rommens, J.M. et al., in
Proceedings of the third international workshop on the
identification of transcribed sequences (eds.
Hochgeschwender, U. & Gardiner, K.) 65-79 (Plenum, New
York, 1994)). Three of these hybridized to a common 20 kb
EcoRI restriction fragment and showed high sequence
homology to the bovine poly(A) binding protein II gene
(bPAB II) (Fig. 1A). The PAB II gene appeared to be a good
candidate for OPMD because it mapped to the genetically
defined 0.26 cM candidate interval in 14q11 (Fig. 1A),
its mRNA showed a high level of expression in skeletal
muscle, and the PAB II protein is exclusively localized
to the nucleus (Krause, S. et al., Exp. Cell Res. 214,
75-82 (1994)) where it acts as a factor in mRNA
polyadenylation (Wahle, E., Cell 66, 759-768 (1991);
Wahle, E. et al., J. Biol. Chem. 268, 2937-2945 (1993);
Bienroth, S. et al., EMBO J. 12, 585-594 (1993)).

We subcloned an 8 kb HindIII genomic fragment containing
the PAB II gene, and sequenced 6002 bp (GenBank:
AF026029) (Nemeth, A. et al., Nucleic Acids Res. 23,
4034-4041 (1995)) (Fig. 1B). Fig. 1B shows the genomic
structure of the PAB II gene and the position of the OPMD
(GCG)n expansions. Exons are numbered. Introns 1 and 6
are variably present in 60% of cDNA clones. ORF, open
reading frame; cen, centromere; tel, telomere.

The coding sequence was based on the previously published
bovine sequence (GenBank: X89969) and the sequence of 31
human cDNAs and ESTs. The gene is composed of 7 exons and
is transcribed in the cen-qter orientation (Fig. 1B).
Multiple splice variants are found in ESTs and on
Northern blots (Nemeth, A. et al., Nucleic Acids Res. 23,
4034-4041 (1995)). In particular, introns 1 and 6 are
present in more than 60% of clones (Fig. 1B) (Nemeth, A.
et al., Nucleic Acids Res. 23, 4034-4041 (1995)). The
coding and protein sequences are highly conserved between
human, bovine and mouse (GenBank: U93050). 93% of the
PAB II sequence was readily amenable to RT-PCR- or
genomic-SSCP screening. No mutations were uncovered using
either technique. However, a 400 bp region of exon 1
containing the start codon could not be readily
amplified. This region is 80% GC rich. It includes a
(GCG)6 repeat which codes for the first six alanines of a
homopolymeric stretch of 10 (Fig. 2G). Fig. 2G shows the
nucleotide sequence of the mutated region of PAB II, the
amino acid sequences of the N-terminus polyalanine
stretch and the position of the OPMD alanine insertions.
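
The relationship between the (GCG)6 repeat and the 10-alanine stretch can be checked by reading codons from the start of the open reading frame. The sketch below is illustrative Python, not the authors' method. It assumes the normal exon 1 start reads ATG, six GCG, three GCA, one GCG and then GGG, as it appears in the sequence listing, and the expanded allele is built by hand for comparison. All function and variable names are hypothetical.

```python
# Illustrative only: counts the leading GCG run and the polyalanine run
# that follow the ATG start codon of a coding sequence.
ALANINE_CODONS = {"GCA", "GCC", "GCG", "GCT"}


def codons(cds: str) -> list[str]:
    """Split a coding sequence into complete triplets."""
    cds = cds.upper().replace(" ", "")
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def leading_run(cds: str, allowed: set[str]) -> int:
    """Count consecutive codons from `allowed` right after the ATG."""
    triplets = codons(cds)
    if not triplets or triplets[0] != "ATG":
        raise ValueError("sequence must start with ATG")
    count = 0
    for codon in triplets[1:]:
        if codon not in allowed:
            break
        count += 1
    return count


def exon1_start(gcg_repeats: int) -> str:
    """Hypothetical helper: start of the ORF with a given GCG repeat count."""
    return "ATG" + "GCG" * gcg_repeats + "GCA" * 3 + "GCG" + "GGG"


if __name__ == "__main__":
    for n in (6, 9):
        seq = exon1_start(n)
        print(n, leading_run(seq, {"GCG"}), leading_run(seq, ALANINE_CODONS))
```

For the normal allele the GCG run is 6 and the alanine run is 10; an allele with nine GCG codons gives an alanine run of 13, in line with Table 1.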

Special conditions were designed to amplify by PCR a
242 bp genomic fragment including this GCG repeat. The
(GCG)6 allele was found in 98% of French Canadian
non-OPMD control chromosomes, whereas 2% of chromosomes
carried a (GCG)7 polymorphism (n=86) (Brais, B. et al.,
Hum. Mol. Genet. 4, 429-434 (1995)).

Screening OPMD cases belonging to 144 families showed in
all cases a PCR product larger by 6 to 21 bp than that
found in controls (Fig. 2A). Fig. 2A shows the (GCG)6
normal allele (N) and the six different (GCG)n expansions
observed in 144 families.

Sequencing of these fragments revealed that the increased
sizes were due to expansions of the GCG repeat (Fig. 2G).
Fig. 2F shows the sequence of the (GCG)9 French Canadian
expansion in a heterozygous parent and his homozygous
child: partial sequence of exon 1 in a normal (GCG)6
control (N), a heterozygote (ht.) and a homozygote (hm.)
for the (GCG)9-repeat mutation. The number of families
sharing the different (GCG)n-repeat expansions is shown
in Table 1.

Table 1

Number of families sharing the different dominant (GCG)n
OPMD mutations

Mutations     Polyalanine*     Families
(GCG)8        12               4
(GCG)9        13               99
(GCG)10       14               19
(GCG)11       15               16
(GCG)12       16               5
(GCG)13       17               1
Total                          144

* 10 alanine residues in normal PAB II.
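
As a consistency check on Table 1, the family counts can be tallied and the polyalanine length derived from the repeat count. The sketch below is illustrative Python. It assumes the polyalanine length is the repeat count plus four, since the normal (GCG)6 allele encodes a 10-alanine stretch, and the names used are hypothetical.

```python
# Illustrative tally of Table 1: repeat count -> number of families.
FAMILIES_BY_REPEAT = {8: 4, 9: 99, 10: 19, 11: 16, 12: 5, 13: 1}

NORMAL_REPEATS = 6
NORMAL_POLYALANINE = 10  # alanine residues in normal PAB II


def polyalanine_length(repeats: int) -> int:
    """Alanine residues in the N-terminal stretch for a (GCG)n allele."""
    return NORMAL_POLYALANINE + (repeats - NORMAL_REPEATS)


total_families = sum(FAMILIES_BY_REPEAT.values())
share_gcg9 = FAMILIES_BY_REPEAT[9] / total_families

if __name__ == "__main__":
    for n, families in sorted(FAMILIES_BY_REPEAT.items()):
        print(f"(GCG){n}: {polyalanine_length(n)} alanines, {families} families")
    print(f"total: {total_families}, (GCG)9 share: {share_gcg9:.1%}")
```

The counts sum to the 144 families screened, and the (GCG)9 expansion accounts for 99 of them.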


The (GCG)9 expansion shared by 70 French Canadian
families is the most frequent mutation we observed
(Table 1). The (GCG)9 expansion is quite stable, with a
single doubling observed in family F151 in an estimated
598 French Canadian meioses (Fig. 2C). Fig. 2C
demonstrates the doubling of the French Canadian (GCG)9
expansion in family F151.

This contrasts with the unstable nature of previously
described disease-causing triplet-repeats (Rosenberg,
R.N., New Eng. J. Med. 335, 1222-1224 (1996)).

Genotyping of all the participants in our clinical study
of French Canadian OPMD provided molecular insights into
the clinical variability observed in this condition. The
genotypes for both copies of the PAB II mutated region
were added to an anonymous version of our clinical
database of 176 (GCG)9 mutation carriers (Brais, B. et
al., Hum. Mol. Genet. 4, 429-434 (1995)). Severity of the
phenotype can be assessed by the swallowing time (st) in
seconds taken to drink 80 cc of ice-cold water (Brais, B.
et al., Hum. Mol. Genet. 4, 429-434 (1995); Bouchard,
J.-P. et al., Can. J. Neurol. Sci. 19, 296-297 (1992)).
The late onset and progressive nature of the muscular
dystrophy is clearly illustrated in heterozygous carriers
of the (GCG)9 mutation (bold curve in Fig. 3) when
compared with the average st of control (GCG)6 homozygous
participants (n=76, thinner line in Fig. 3). The bold
curve represents the average OPMD st for carriers of only
one copy of the (GCG)9 mutation (n=169), while the
thinner line corresponds to the average st for (GCG)6
homozygous normal controls (n=76). The black dot
corresponds to the st value for individual VIII. Roman
numerals refer to individual cases shown in Figs. 2B, 2D
and discussed in the text. Genotype of a homozygous
(GCG)9 case and her parents (Fig. 1B). Independent
segregation of the (GCG)7 allele. Case V has a more
severe OPMD phenotype (Fig. 1B).

Two groups of genotypically distinct OPMD cases have more
severe swallowing difficulties. Individuals I, II, and
III have an early-onset disease and are homozygous for
the (GCG)9 expansion (P < 10^-5) (Figs. 2B, F). Cases IV,
V, VI and VII have more severe phenotypes and are
compound heterozygotes for the (GCG)9 mutation and the
(GCG)7 polymorphism (P < 10^-5). In Fig. 2D the
independent segregation of the two alleles is shown. Case
V, who inherited the French Canadian (GCG)9 mutation and
the (GCG)7 polymorphism, is more symptomatic than his
brother VIII who carries the (GCG)9 mutation and a normal
(GCG)6 allele (Figs. 2D and 3). The (GCG)7 polymorphism
thus appears to be a modifier of severity of dominant
OPMD. Furthermore, the (GCG)7 allele can act as a
recessive mutation. This was documented in the French
patient IX who inherited two copies of the (GCG)7
polymorphism and has a late-onset autosomal recessive
form of OPMD (Fig. 2E). Case IX, who has a recessive form
of OPMD, is shown to have inherited two copies of the
(GCG)7 polymorphism.
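
The genotype-phenotype observations above can be summarized as a small lookup on the two repeat counts. The Python sketch below is a simplified reading of the text, not a diagnostic tool. The text documents only (GCG)9 homozygotes and (GCG)9/(GCG)7 compound heterozygotes among the more severe cases, so extending those categories to every expansion from (GCG)8 to (GCG)13 is an assumption made here for illustration. The function name and category labels are hypothetical.

```python
# Simplified, illustrative mapping from a pair of (GCG)n repeat counts
# to the phenotype categories described in the text.
def classify_genotype(allele_a: int, allele_b: int) -> str:
    low, high = sorted((allele_a, allele_b))
    if low < 6 or high > 13:
        raise ValueError("outside the (GCG)6-13 range reported in the text")
    if high == 6:
        return "normal"
    if high == 7:
        # One copy is a polymorphism; two copies act as a recessive mutation.
        if low == 7:
            return "autosomal recessive OPMD"
        return "carrier of the (GCG)7 polymorphism"
    # high >= 8: at least one dominant expansion (assumed for all of 8-13)
    if low == 6:
        return "dominant OPMD"
    if low == 7:
        return "dominant OPMD, more severe (compound heterozygote)"
    return "dominant OPMD, early onset (two expanded alleles)"


if __name__ == "__main__":
    for pair in [(6, 6), (6, 7), (7, 7), (9, 6), (9, 7), (9, 9)]:
        print(pair, "->", classify_genotype(*pair))
```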

This is the first description of short trinucleotide
repeat expansions causing a human disease. The addition
of only two GCG repeats is sufficient to cause dominant
OPMD. OPMD expansions do not share the cardinal features
of "dynamic mutations". The GCG expansions are not only
short, they are also meiotically quite stable.
Furthermore, there is a clear cut-off between the normal
and abnormal alleles, a single GCG expansion causing a
recessive phenotype. The PAB II (GCG)7 allele is the
first example of a relatively frequent allele which can
act either as a modifier of a dominant phenotype or as a
recessive mutation. This dosage effect is reminiscent of
the one observed in a homozygote for two dominant
synpolydactyly mutations. In this case, the patient had
more severe deformities because she inherited two
duplications causing an expansion in the polyalanine
tract of the HOXD13 protein (Akarsu, A.N. et al., Hum.
Mol. Genet. 5, 945-952 (1996)). A duplication causing a
similar polyalanine expansion in the α subunit 1 gene of
the core-binding transcription factor (CBFα1) has also
been found to cause dominant cleidocranial dysplasia
(Mundlos, S. et al., Cell 89, 773-779 (1997)). The
mutations in these two rare diseases are not
triplet-repeats. They are duplications of "cryptic
repeats" composed of mixed synonymous codons and are
thought to result from unequal crossing over (Warren,
S.T., Science 275, 408-409 (1997)). In the case of OPMD,
slippage during replication causing a reiteration of the
GCG codon is a more likely mechanism (Wells, D.R., J.
Biol. Chem. 271, 2875-2878 (1996)).

Different observations converge to suggest that a gain of
function of PAB II may cause the accumulation of nuclear
filaments observed in OPMD (Tome, F.M.S. & Fardeau, Acta
Neuropath. 49, 85-87 (1980)). PAB II is found mostly in
dimeric and oligomeric form (Nemeth, A. et al., Nucleic
Acids Res. 23, 4034-4041 (1995)). It is possible that the
polyalanine tract plays a role in polymerization.
Polyalanine stretches have been found in many other
nuclear proteins such as the HOX proteins, but their
function is still unknown (Davies, S.W. et al., Cell 90,
537-548 (1997)). Alanine is a highly hydrophobic amino
acid present in the cores of proteins. In dragline spider
silk, polyalanine stretches are thought to form β-sheet
structures important in ensuring the fibers' strength
(Simmons, A.H. et al., Science 271, 84-87 (1996)).
Polyalanine oligomers have also been shown to be
extremely resistant to chemical denaturation and
enzymatic degradation (Forood, B. et al., Bioch. and
Biophy. Res. Com. 211, 7-13 (1995)). One can speculate
that PAB II oligomers
comprised of a sufficient number of mutated molecules



WO 99/29896 






PCT/CA98/01133 



might accumulate in the nuclei by forming undegradable
polyalanine-rich macromolecules. The rate of the
accumulation would then depend on the ratio of mutated to
non-mutated protein. The more severe phenotypes observed
in homozygotes for the (GCG)9 mutation and compound
heterozygotes for the (GCG)9 mutation and (GCG)7 allele
may correspond to the fact that in these cases PAB II
oligomers are composed only of mutated proteins. The
ensuing faster filament accumulation could cause
accelerated cell death. The recent description of nuclear
filament inclusions in Huntington's disease raises the
possibility that "nuclear toxicity" caused by the
accumulation of mutated homopolymeric domains is involved
in the molecular pathophysiology of other triplet-repeat
diseases (Davies, S.W. et al., Cell 90, 537-548 (1997);
Scherzinger, E. et al., Cell 90, 549-558 (1997);
DiFiglia, M. et al., Science 277, 1990-1993 (1997)).
Future immunocytochemical and expression studies will be
able to test this pathophysiological hypothesis and
provide some insight into why certain muscle groups are
more affected while all tissues express PAB II.
Methods

Contig and cDNA selection

The cosmid contig was constructed by standard cosmid
walking techniques using a gridded chromosome 14-specific
cosmid library (Evans, G.A. et al., Gene 79, 9-20
(1989)). The cDNA clones were isolated by cDNA selection
as previously described (Rommens, J.M. et al., in
Proceedings of the third international workshop on the
identification of transcribed sequences (eds.
Hochgeschwender, U. & Gardiner, K.) 65-79 (Plenum, New
York, 1994)).

Cloning of the PAB II gene. Three cDNA clones
corresponding to PAB II were sequenced (Sequenase, USB).
Clones were verified to map to cosmids by Southern
hybridization. The 8 kb HindIII restriction fragment was
subcloned from cosmid 166G8 into pBluescript II (SK)
(Stratagene). The clone was sequenced using primers
derived from the bPAB II gene and human EST sequences.
Sequencing of the PAB II introns was done by primer
walking.

PAB II mutation screening and sequencing. All cases were
diagnosed as having OPMD on clinical grounds (Brais, B.
et al., Hum. Mol. Genet. 4, 429-434 (1995)). RT-PCR- and
genomic SSCP analyses were done using standard protocols
(Lafreniere, R.G. et al., Nat. Genet. 15, 298-302
(1997)). The primers used to amplify the PAB II mutated
region were: 5'-CGCAGTGCCCCGCCTTAGA-3' (SEQ ID NO:4) and
5'-ACAAGATGGCGCCGCCGCCCCGGC-3' (SEQ ID NO:5). PCR
reactions were performed in a total volume of 15 µl
containing: 40 ng of genomic DNA; 1.5 µg of BSA; 1 µM of
each primer; 250 µM dCTP and dTTP; 25 µM dATP; 125 µM of
dGTP and 125 µM of 7-deaza-dGTP (Pharmacia); 7.5% DMSO;
3.75 µCi [35S]dATP, 1.5 units of Taq DNA polymerase and
1.5 mM MgCl2 (Perkin Elmer). For non-radioactive PCR
reactions the [35S]dATP was replaced by 225 µM of dATP.
The amplification procedure consisted of an initial
denaturation step at 95°C for five minutes, followed by
35 cycles of denaturation at 95°C for 15 s, annealing at
70°C for 30 s, elongation at 74°C for 30 s and a final
elongation at 74°C for 7 min. Samples were loaded on 5%
polyacrylamide denaturing gels. Following
electrophoresis, gels were dried and autoradiographs were
obtained. Sizes of the inserts were determined by
comparing to a standard M13 sequence (Sequenase, USB).
Fragments used for sequencing were gel-purified.
Sequencing of the mutated fragment using the Amplicycle
kit (Perkin Elmer) was done with the
5'-CGCAGTGCCCCGCCTTAGAGGTG-3' (SEQ ID NO:6) primer at an
elongation temperature of 68°C.
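
The amplification procedure described above can be written down as a simple program table, which makes the total hold time easy to check. The sketch below is illustrative Python. It counts only the stated hold times and ignores ramping between temperatures, which depends on the instrument. The class and variable names are hypothetical.

```python
# Illustrative representation of the stated cycling conditions.
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


INITIAL = Step("initial denaturation", 95, 5 * 60)
CYCLE = (
    Step("denaturation", 95, 15),
    Step("annealing", 70, 30),
    Step("elongation", 74, 30),
)
N_CYCLES = 35
FINAL = Step("final elongation", 74, 7 * 60)


def total_hold_seconds() -> int:
    """Sum of programmed hold times, excluding temperature ramps."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL.seconds + N_CYCLES * per_cycle + FINAL.seconds


if __name__ == "__main__":
    seconds = total_hold_seconds()
    print(f"{seconds} s ({seconds / 60:.2f} min) of programmed hold time")
```

With these values the programmed hold time is 3345 s, a little under 56 minutes before ramp times are added.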

Stability of (GCG)-repeat expansions. The meiotic
stability of the (GCG)9-repeat was estimated based on our
large French Canadian OPMD cohort. We previously
established that a single ancestral OPMD carrier
chromosome was introduced in the French Canadian
population by three sisters in 1648. Seventy of the
seventy-one French Canadian OPMD families tested to date
segregate a (GCG)9 expansion. However, in family F151,
the affected brother and sister, despite sharing the
French Canadian ancestral haplotype, carry a (GCG)12
expansion twice the size of the ancestral (GCG)9 mutation
(Fig. 2C). In our founder effect study, we estimated that
450 (304-594) historical meioses shaped the 123 OPMD
cases belonging to 42 of the 71 enrolled families. Our
screening of our full set of participants allowed us to
identify another 148 (GCG)9 carrier chromosomes.
Therefore, we estimate that a single mutation of the
(GCG)9 expansion has occurred in 598 (452-742) meioses.
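
The meiosis estimate is the sum of the historical meioses from the founder effect study and the additional carrier chromosomes identified by screening. The Python sketch below reproduces that addition and derives a naive per-meiosis point estimate from the single observed event. The rate calculation is an illustration added here, not a figure from the study, and the variable names are hypothetical.

```python
# Illustrative arithmetic for the meiotic stability estimate.
HISTORICAL_MEIOSES = (450, 304, 594)  # estimate, lower bound, upper bound
ADDITIONAL_CARRIER_CHROMOSOMES = 148
OBSERVED_MUTATIONS = 1  # the (GCG)9 to (GCG)12 change in family F151

# Each additional carrier chromosome is counted as one more meiosis.
total_meioses = tuple(
    value + ADDITIONAL_CARRIER_CHROMOSOMES for value in HISTORICAL_MEIOSES
)

# Naive point estimate (assumption: one event over the central estimate).
naive_rate = OBSERVED_MUTATIONS / total_meioses[0]

if __name__ == "__main__":
    estimate, low, high = total_meioses
    print(f"{estimate} ({low}-{high}) meioses")
    print(f"naive rate: about {naive_rate:.4f} per meiosis")
```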

Genotype-phenotype correlations. 176 carriers of at least
one copy of the (GCG)9 mutation were examined during the
early stage of the linkage study. All were asked to
swallow 80 cc of ice-cold water as rapidly as possible.
Testing was stopped after 60 seconds. The swallowing time
(st) was validated as a sensitive test to identify OPMD
cases (Brais, B. et al., Hum. Mol. Genet. 4, 429-434
(1995); Bouchard, J.-P. et al., Can. J. Neurol. Sci. 19,
296-297 (1992)). The st values for 76 (GCG)6 homozygous
normal controls are illustrated in Fig. 3. Analyses of
variance were computed by two-way ANOVA (SYSTAT package).
For the (GCG)9 homozygotes, their mean st value was
compared to the mean value for all (GCG)9 heterozygotes
aged 35-40 (P < 10^-5). For the (GCG)9 and (GCG)7
compound heterozygotes, their mean st value was compared
to the mean value for all (GCG)9 heterozygotes aged 45-65
(P < 10^-5).

While the invention has been described in connection with
specific embodiments thereof, it will be understood that
it is capable of further modifications and this
application is intended to cover any variations, uses, or
adaptations of the invention following, in general, the
principles of the invention and including such departures
from the present disclosure as come within known or
customary practice within the art to which the invention
pertains and as may be applied to the essential features
hereinbefore set forth, and as follows in the scope of
the appended claims.

WHAT IS CLAIMED IS:

1. A human PAB II gene containing transcribed polymorphic
GCG repeat, which comprises a sequence as set forth in
SEQ ID NO:3, which includes introns and flanking genomic
sequence.

2. The gene of claim 1, wherein allelic variants of 
GCG repeat are associated with a disease related with 
protein accumulation in nucleus. 

3. The gene of claim 2, wherein said protein accu- 
mulation is polyalanine accumulation. 

4. The gene of claim 1, wherein allelic variants of 
GCG repeat are associated with a disease related with 
swallowing difficulties. 

5. The gene of claim 1, wherein said disease is 
oculopharyngeal muscular dystrophy. 

6. A method for the diagnosis of a disease with protein
accumulation in nucleus, which comprises the steps of:

a) obtaining a nucleic acid sample of said patient; and

b) determining allelic variants of GCG repeat of the gene
of claim 1, and wherein long allelic variants are
indicative of a disease related with protein
accumulation in nucleus.

7. The method of claim 6, wherein said disease is
oculopharyngeal muscular dystrophy.

8. The method of claim 7, wherein said long allelic
variants have from about 245 to about 263 bp in length.

9. A non-human mammal model for the PAB II gene of 
claim 1, whose germ cells and somatic cells, are modi- 
fied to express at least one allelic variant of the PAB 
II gene and wherein said allelic variant of the PAB II 
being introduced into the mammal, or an ancestor of the 
mammal, at an embryonic stage. 

10. A method for the screening of therapeutic agents for
the prevention and/or treatment of oculopharyngeal
muscular dystrophy, which comprises the steps of:

a) administering said therapeutic agents to the non-human
mammal of claim 9 or oculopharyngeal muscular dystrophy
patients; and

b) evaluating the prevention and/or treatment of
development of oculopharyngeal muscular dystrophy in
said mammal or said patients.

11. A method to identify genes part of or interacting
with a biochemical pathway affected by PAB II gene, which
comprises the steps of:

a) designing probes and/or primers using the hGTl gene of
claim 1 and screening oculopharyngeal muscular dystrophy
patients samples with said probes and/or primers; and

b) evaluating the identified gene role in oculopharyngeal
muscular dystrophy patients.

SEQUENCE LISTING



<110> MCGILL UNIVERSITY 
ROULEAU, Guy A. 
BRAIS , Bernard 



<120> SHORT GCG EXPANSIONS IN THE PAB II GENE
FOR OCULOPHARYNGEAL MUSCULAR DYSTROPHY AND DIAGNOSTIC
THEREOF



<130> 1770-199PCT FC/ld 

<150> CA 2,218,199 
<151> 1997-12-09 

<160> 6 

<170> FastSEQ for Windows Version 3.0 



<210> 1 
<211> 57 
<212> DNA 

<213> Artificial Sequence 



<400> 1 

atggcggcgg cggcggcggc ggcagcagca atggcggcgg cggcggcggc ggcggca 

<210> 2 
<211> 35 
<212> DNA 

<213> Artificial Sequence 
<400> 2 

maaaaaaaaa agaaggrgsm aaaaaaaaaa agaag 



<210> 3 
<211> 6002 
<212> DNA 

<213> Artificial Sequence 



<400> 3
aatgaaggtg gacacccaaa tagccccaat acaaatgcct gttcaatcaa ccaaacatct 60
aagcagcaca tctatgtggt agcatattgc caggccgtga gactgcgaat ataaatagga 120
accgcccctc atctgcaggc gctcacaacc tagttagcaa acagtaaaac aattaagcgc 180
gccgtggaca taggcccact tgtcctggga aatgagggga agctggggtt tgcagtggtt 240
tgattgaagg gggactacat gttagaggca cagactgggt gcaggtacac ccaaaggaac 300
gagaagagtg gaaggaaaca acatccacaa agtaaccaca tgctggcgta tcgaaggccg 360
tgatttacgg ttttgagact ttacctcgcc agcaaagggg ggccagtctg ttagcggtgc 420
agattggagg ggtgacattg gaagctgtcc aggaaaaaga aaatggaact ggggagcaga 480
aggcctacgc aagagggcgg gacagacagg acttgtgact agtagctctg gactgaggaa 540
tcctccctgc tttctggtgc gggagagcta gtggatgatg gtgccaataa cctggatggg 600
gaaagtaagc tccctcctgg aatgcttcat tcacaacctc cattttcagc aacatcccat 660
ctactggtgc ttcctggtcg agatacaagt ttcctgaaac tgctgctctg ttttgggcct 720
cacccggcca acagctcact agctggcaag cagtagtatc aagatggcgg ccccctagga 780
ctggctagtc atgtgacctc gggtttccca agtttgaagc ccggcagtcc tttcgggggc 840
aaggttcacc tgtcacgaaa cgagtgtcac cccttcgact ctcgcaagcc aatcggcatc 900
tgagactggg ccactgcggt gaggcgatcg gaagattggt cctttccagt cgcctagcta 960
gggccaatca cggagcgtcc catacttcgc gggcccgccc gtaggccggg gagaagcagg 1020

aatatcgtca cagcgtggcg gtattattac ctaaggactc gataggaggt gggacgcgtg 1080 

ttgattgaca ggcagatttc cctaccggga tttgagaatt tggcgcagtg cccgccttag 1140 

aggtgcgctt atttgattgc caagtaatat tccccaatgg agtactagct catggtgacg 1200 

ggcaggcagc ttgagctaat gagtcctccg tggccggcgc agctctccac atgccgggcg 1260 

gcgggcccca gtctgagcgg cgatggcggc ggcggcggcg gcggcagcag cagcgggggc 1320

tgcgggcggt cggggctccg ggccggggcg gcggcgccat cttgtgcccg gggccggtgg 1380

ggaggccggg gagggggccc cggggggcgc aggggactac gggaacggcc tggagtctga 1440

ggaactggag cctgaggagc tgctgctgga gcccgagccg gagcccgagc ccgaagagga 1500 

gccgccccgg ccccgcgccc ccccgggagc tccgggccct gggcctggtt cgggagcccc 1560 

cggcagccaa gaggaggagg aggagccggg actggtcgag ggtgacccgg gggacggcgc 1620 

cattgaggac ccggtgagga aggagggcga gcgagcaggc cggcggctgg cgcgtcactg 1680 

gaggcccaga gctcgggcga gcggtggcag gcggggggtg gggttgggcg gggaataacg 1740

tggctggggc gggtcgggcc ggggatgggt cagcgatcac tacaaggggc ccgactggct 1800

tgattcgggc gtcacgggtg cctagtgttg ttctagagag ggtagctttt cttttatcac 1860 

gaccctcgca tggggcgagg gaaatggccg agcatggctg aggcgcgctc tggccgagag 1920 

cagggcacag cccctgcgtt ggttcctctt aagctgtcct ccataccctc cccacttata 1980 

ttaggagctg gaagctatca aagctcgagt cagggagatg gaggaagaag ctgagaagct 2040

aaaggagcta cagaacgagg tagagaagca gatgaatatg agtccacctc caggcaatgc 2100

tgagtaactg gcggttgcac gcggagcccg ggttctcggg ttggaagggt tgtggggagg 2160 

atggggaatg tggggttaga tactcggcac cctggagctg cttgtctgag ctattatgac 2220 

tgtgccgcgg tcatagtccg ttgtgtgttc ctctgacctt tgtgaggcag aactgatatt 2280 

ttggtggtgg tagccttgtg cctccctttg tcctgttata attgtgttgc tctttattct 2340 

tagtctacgt ctatctttct ttggtagagg ttgcgtgctc gcatttgacc ttcaaatcta 2400 

atagtttttc ctccaattgg agacgcttta ggattctaag agaaagcaag ctggaagggg 2460 

tttccccttt aaattctaga aatgtggagt ctcagcccac ttaattttgc tcactcttaa 2520 

aagcatttca accaaagcca ttcattaggg atttgatttg gagggcagga gggattccta 2580 

tactgtttta agtgtgtatt aattctttca atttatcgaa ttatttagtg agtaacctgc 2640 

tatgcactag gcactattct cggcttgtgg gtacagcagg gaacagcaca gaccaaaatc 2700 

tttgccttca ctgagcttat gggatagtgc tggtggtgga agtgcaacat attggtcaag 2760 

tagaaaacaa gtgtgtggtt tttgtaaaaa attatttttt cctgatagct ggcccggtga 2820 

tcatgtccat tgaggagaag atggaggctg atgcccgttc catctatgtt ggcaatgtga 2880

cgtactgggg ctctgactgg ggttgggggc aagttcttct tttggggaat tatttaatag 2940

tcctgaaaga acatctccgg gatagatgtg gttttgggtg tggagggagt gtgggaagga 3000

ggttaaaggt aatggaatga tcagtaatca gcaaaggctc tgggtttgga aggaaaagag 3060

attaattcct caaattacca gatttcatgt gctttggtgt atgatggccc agaccaaagg 3120

ctcgggaggg ttcttttgag acaggaattt gcctggtgcc tgtgaaattt ttctcctctc 3180 

atcaggtgga ctatggtgca acagcagaag agctggaagc tcactttcat ggctgtggtt 3240 

cagtcaaccg tgttaccata ctgtgtgaca aatttagtgg ccatcccaaa ggtaaagtaa 3300 

aggggagtaa gttgagataa tttaaattac agtgtacaaa tagataaatt atgttttata 3360 

ttgagcagta agttatttgg tgttaacaca ggtgatctgt gtcatttaag atcatggcat 3420 

taatgttgat atatcaggag ttgcacctaa atgtcttcag aggccagata acaaaaatga 3480 

aggctagatg tgggtgggat tacgaactag aaggggaggg gcagcttcta cttggcctat 3540 

tatggcatat ggaaattcag gccctgtgtg tcttattttt acaaatttca aagagtagct 3600 

ggaaatttta aaatttaaat gatttcgaat gattgaaatt ttccatttag aagaattttg 3660 

acaaataaaa aatataactg cattgtagcc caaaacgaag catgcctgca ggttgaattt 3720

gacctgtgag gtatttgtaa cctcagagag atacaatgac aattcttttc aggtttgcgt 3780 

atatagagtt ctcagacaaa gagtcagtga ggacttcctt ggccttagat gagtccctat 3840 

ttagaggaag gcaaatcaag gtaagcctat gtccattgct gttctagttg tgtataaact 3900 

ctccaggttg cctttaaggc tatcatttgt tcatctctga ctcaggtgat cccaaaacga 3960

accaacagac caggcatcag cacaacagac cggggttttc cacgagcccg ctaccgcgcc 4020 

cggaccacca actacaacag ctcccgctct cgattctaca gtggttttaa cagcaggccc 4080 

cggggtcgcg tctacaggtc aggatagatg ggctgctcct ctttcccccg cctcccgtga 4140 

gccccgtatg cttcctcctc tctggtctga ggaacctccc tccccccacc cctccccgtg 4200 

gtcttcagga actttgtctc ctgcctgtgc aggttgagga aggtagttgc aggccaggcc 4260 

agaaggcagc ctcatcatct tttctgcagt agaaattggt gataagggct gcatccctcc 4320 

cttggttcaa agaggcttcc acccccagcc ttttttttct tgggagttgg tggcatttga 4380
aggtgtttgc ggacaaaact gggaggaaca gggcctccag gaagttgaaa gcactgcttg 4440
gacatttgtt acttttttcg gagttaggga gggattgaag actgaacctc ccttggaaga 4500
ataccagagg ctagctagtt gatcctccca acagccttgt gggaggattt tgagatactt 4560
attctttatt tgagccagtc ttgcaaggtt aacttctcac tgggcctagt gtggtnccca 4620
ggtttttgcc ttgcttcact tctgtctcta catttaaata gacgggttag gcatataaac 4680
cttggctttt cataagctct acctgcctat ccccaggagt tagggaggat ctatttgtga 4740
aggccctagg gtttaaaaac tgtggaggac tgaaaaactg gataaaaagg gggtcctttt 4800
ccttgcccct gtctctcact cagatgcgct tctttttcgc cactgtttgg caaagttttc 4860
tgttaagccc ccctccccct gccccagttc tcccaggtgc gttactattt ctgggatcat 4920
ggggtcggtt ttaggacact tgaacacttc ttttcccccc ttcccttcac agtaactggg 4980
gcaggggcct acggggaggg gcttgtactg aactatctag tgatcacgtt aacacctaac 5040
tctccttctt tcttccaggg gccgggctag agcgacatca tggtattccc cttactaaaa 5100
aaagtgtgta ttaggaggag agagaggaaa aaaagaggaa agaaggaaaa aaaaaagaat 5160
taaaaaaaaa aaaaagaaaa acagaagatg accttgatgg aaaaaaaata ttttttaaaa 5220
aaaagatata ctgtggaagg ggggagaatc ccataactaa ctgctgagga gggacctgct 5280
ttggggagta ggggaaggcc cagggagtgg ggcagggggc tgcttattca ctctggggat 5340
tcgccatgga cacgtctcaa ctgcgcaagc tgcttgccca tgtttccctg cccccttcac 5400
ccccttgggc ctgctcaagg gtaggtgggc gtgggtggta ggagggtttt ttttacccag 5460
ggctctggaa ggacaccaaa ctgttctgct tgttaccttc cctcccgtct tctcctcgcc 5520
tttcacagtc ccctcctgcc tgctcctgtc cagccaggtc taccacccac cccacccctc 5580
tttctccggc tccctgcccc tccagattgc ctggtgatct attttgtttc cttttgtgtt 5640
tctttttctg ttttgagtgt ctttctttgc aggtttctgt agccggaaga tctccgttcc 5700
gctcccagcg gctccagtgt aaattcccct tccccctggg gaaatgcact accttgtttt 5760
ggggggttta ggggtgtttt tgtttttcag ttgttttgtt tttttgtttt ttttttttcc 5820
tttgcctttt ttccctttta tttggaggga atgggaggaa gtgggaacag ggaggtggga 5880
ggtggatttt gtttattttt ttagctcatt tccaggggtg ggaatttttt tttaatatgt 5940
gtcatgaata aagttgtttt tgaaaataaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 6000
aa 6002



<210> 4 
<211> 19 
<212> DNA 

<213> Artificial Sequence 

<400> 4 
cgcagtgccc cgccttaga 

<210> 5 
<211> 24 
<212> DNA 

<213> Artificial Sequence 

<400> 5 
acaagatggc gccgccgccc cggc 

<210> 6 

<211> 23 

<212> DNA 

<213> Artificial Sequence 



<400> 6 
cgcagtgccc cgccttagag gtg